Dataset statistics
| Number of variables | 22 |
|---|---|
| Number of observations | 344 |
| Missing cells | 2752 |
| Missing cells (%) | 36.4% |
| Duplicate rows | 0 |
| Duplicate rows (%) | 0.0% |
| Total size in memory | 82.1 KiB |
| Average record size in memory | 244.4 B |
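The derived figures in this table can be cross-checked from the raw counts. The sketch below is only an illustration: the variable names are made up for this example, and the inputs are copied from the table above.

```python
# Figures copied from the "Dataset statistics" table (names are illustrative).
n_variables = 22
n_observations = 344
missing_cells = 2752
total_size_kib = 82.1

# Share of missing cells over all cells in the table.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# How many fully empty columns would account for the missing cells.
fully_missing_columns = missing_cells / n_observations

# Average bytes per record, from the total size in KiB.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"missing cells: {missing_pct:.1f}%")
print(f"equivalent fully missing columns: {fully_missing_columns:.0f}")
print(f"average record size: {avg_record_bytes:.1f} B")
```

The missing cells correspond exactly to eight fully empty columns, which matches the eight variables reported as 100.0% missing.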
Variable types
| Numeric | 10 |
|---|---|
| Unsupported | 8 |
| Categorical | 4 |
Warnings
| Warning | Type |
|---|---|
| operation_car has constant value "29.0" | Constant |
| destination_esr is highly correlated with operation_st_esr | High correlation |
| operation_st_esr is highly correlated with destination_esr | High correlation |
| operation_date is highly correlated with operation_car and 1 other fields | High correlation |
| rodvag is highly correlated with operation_car | High correlation |
| operation_car is highly correlated with operation_date and 2 other fields | High correlation |
| adm is highly correlated with operation_date and 1 other fields | High correlation |
| index_train has 344 (100.0%) missing values | Missing |
| danger has 344 (100.0%) missing values | Missing |
| loaded has 344 (100.0%) missing values | Missing |
| operation_train has 344 (100.0%) missing values | Missing |
| rod_train has 344 (100.0%) missing values | Missing |
| ssp_station_esr has 344 (100.0%) missing values | Missing |
| ssp_station_id has 344 (100.0%) missing values | Missing |
| weight_brutto has 344 (100.0%) missing values | Missing |
| df_index has unique values | Unique |
| index_train is an unsupported type, check if it needs cleaning or further analysis | Unsupported |
| danger is an unsupported type, check if it needs cleaning or further analysis | Unsupported |
| loaded is an unsupported type, check if it needs cleaning or further analysis | Unsupported |
| operation_train is an unsupported type, check if it needs cleaning or further analysis | Unsupported |
| rod_train is an unsupported type, check if it needs cleaning or further analysis | Unsupported |
| ssp_station_esr is an unsupported type, check if it needs cleaning or further analysis | Unsupported |
| ssp_station_id is an unsupported type, check if it needs cleaning or further analysis | Unsupported |
| weight_brutto is an unsupported type, check if it needs cleaning or further analysis | Unsupported |
| receiver has 17 (4.9%) zeros | Zeros |
| sender has 18 (5.2%) zeros | Zeros |
Reproduction
| Analysis started | 2021-04-16 09:05:16.524724 |
|---|---|
| Analysis finished | 2021-04-16 09:05:35.818698 |
| Duration | 19.29 seconds |
| Software version | pandas-profiling v2.11.0 |
| Download configuration | config.yaml |
df_index
Real number (ℝ≥0)
| Distinct | 344 |
|---|---|
| Distinct (%) | 100.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 2233827.663 |
|---|---|
| Minimum | 64867 |
| Maximum | 4043546 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 2.8 KiB |
Quantile statistics
| Minimum | 64867 |
|---|---|
| 5-th percentile | 72016.05 |
| Q1 | 1215677.5 |
| median | 2392990 |
| Q3 | 3880534.75 |
| 95-th percentile | 4039605.25 |
| Maximum | 4043546 |
| Range | 3978679 |
| Interquartile range (IQR) | 2664857.25 |
Descriptive statistics
| Standard deviation | 1339825.287 |
|---|---|
| Coefficient of variation (CV) | 0.5997890121 |
| Kurtosis | -1.107076368 |
| Mean | 2233827.663 |
| Median Absolute Deviation (MAD) | 1187245 |
| Skewness | -0.1572587093 |
| Sum | 768436716 |
| Variance | 1.7951318 × 10¹² |
| Monotonicity | Strictly increasing |
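Several rows in the quantile and descriptive tables are simple functions of other rows. As a rough check, assuming the usual definitions (range = max − min, IQR = Q3 − Q1, CV = standard deviation / mean, variance = standard deviation squared), the reported values can be reproduced from the figures above. The variable names are illustrative only.

```python
# Inputs copied from the quantile and descriptive statistics tables above.
minimum, maximum = 64867, 4043546
q1, q3 = 1215677.5, 3880534.75
mean, std = 2233827.663, 1339825.287

value_range = maximum - minimum   # spread between the extremes
iqr = q3 - q1                     # spread of the middle 50% of values
cv = std / mean                   # relative dispersion, unitless
variance = std ** 2               # squared standard deviation

print(value_range, iqr)
print(f"CV = {cv:.6f}, variance = {variance:.7e}")
```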
| Value | Count | Frequency (%) |
| 71680 | 1 | 0.3% |
| 73568 | 1 | 0.3% |
| 4039606 | 1 | 0.3% |
| 3453417 | 1 | 0.3% |
| 3882400 | 1 | 0.3% |
| 3880807 | 1 | 0.3% |
| 73572 | 1 | 0.3% |
| 64867 | 1 | 0.3% |
| 73570 | 1 | 0.3% |
| 1925008 | 1 | 0.3% |
| Other values (334) | 334 | 97.1% |
| Value | Count | Frequency (%) |
| 64867 | 1 | 0.3% |
| 71214 | 1 | 0.3% |
| 71282 | 1 | 0.3% |
| 71288 | 1 | 0.3% |
| 71680 | 1 | 0.3% |
| 71684 | 1 | 0.3% |
| 71686 | 1 | 0.3% |
| 71688 | 1 | 0.3% |
| 71690 | 1 | 0.3% |
| 71696 | 1 | 0.3% |
| Value | Count | Frequency (%) |
| 4043546 | 1 | 0.3% |
| 4043540 | 1 | 0.3% |
| 4043334 | 1 | 0.3% |
| 4043330 | 1 | 0.3% |
| 4043303 | 1 | 0.3% |
| 4043296 | 1 | 0.3% |
| 4043289 | 1 | 0.3% |
| 4043282 | 1 | 0.3% |
| 4043223 | 1 | 0.3% |
| 4041761 | 1 | 0.3% |
length
Real number (ℝ≥0)
| Distinct | 10 |
|---|---|
| Distinct (%) | 2.9% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 0.8820930233 |
|---|---|
| Minimum | 0.79 |
| Maximum | 1.26 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 2.8 KiB |
Quantile statistics
| Minimum | 0.79 |
|---|---|
| 5-th percentile | 0.83 |
| Q1 | 0.83 |
| median | 0.83 |
| Q3 | 0.83 |
| 95-th percentile | 1.06 |
| Maximum | 1.26 |
| Range | 0.47 |
| Interquartile range (IQR) | 0 |
Descriptive statistics
| Standard deviation | 0.1147299151 |
|---|---|
| Coefficient of variation (CV) | 0.130065551 |
| Kurtosis | 3.164683975 |
| Mean | 0.8820930233 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 2.037784078 |
| Sum | 303.44 |
| Variance | 0.01316295342 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 0.83 | 265 | 77.0% |
| 1.06 | 41 | 11.9% |
| 1.26 | 13 | 3.8% |
| 0.79 | 9 | 2.6% |
| 1 | 7 | 2.0% |
| 1.03 | 3 | 0.9% |
| 0.86 | 2 | 0.6% |
| 1.25 | 2 | 0.6% |
| 1.01 | 1 | 0.3% |
| 1.22 | 1 | 0.3% |
| Value | Count | Frequency (%) |
| 0.79 | 9 | 2.6% |
| 0.83 | 265 | 77.0% |
| 0.86 | 2 | 0.6% |
| 1 | 7 | 2.0% |
| 1.01 | 1 | 0.3% |
| 1.03 | 3 | 0.9% |
| 1.06 | 41 | 11.9% |
| 1.22 | 1 | 0.3% |
| 1.25 | 2 | 0.6% |
| 1.26 | 13 | 3.8% |
| Value | Count | Frequency (%) |
| 1.26 | 13 | 3.8% |
| 1.25 | 2 | 0.6% |
| 1.22 | 1 | 0.3% |
| 1.06 | 41 | 11.9% |
| 1.03 | 3 | 0.9% |
| 1.01 | 1 | 0.3% |
| 1 | 7 | 2.0% |
| 0.86 | 2 | 0.6% |
| 0.83 | 265 | 77.0% |
| 0.79 | 9 | 2.6% |
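Because `length` has only ten distinct values, the frequency table above is enough to rebuild the whole column and recompute its summary statistics. The sketch below does that with the standard library. It assumes MAD means the median of absolute deviations from the median, and the names are illustrative.

```python
import math
import statistics

# Value -> count pairs copied from the frequency table for `length` above.
length_counts = {0.83: 265, 1.06: 41, 1.26: 13, 0.79: 9, 1.0: 7,
                 1.03: 3, 0.86: 2, 1.25: 2, 1.01: 1, 1.22: 1}

# Expand the table back into one value per observation.
values = [v for v, c in length_counts.items() for _ in range(c)]

n = len(values)
total = math.fsum(values)
mean = total / n
median = statistics.median(values)
# Median absolute deviation: median of |x - median|.
mad = statistics.median(abs(v - median) for v in values)

print(n, round(total, 2), round(mean, 10), median, mad)
```

More than half of the observations equal the median (265 of 344), so both the IQR and the MAD collapse to 0 even though the standard deviation is non-zero.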
car_number
Real number (ℝ≥0)
| Distinct | 291 |
|---|---|
| Distinct (%) | 84.6% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 33081874.29 |
|---|---|
| Minimum | 30000491 |
| Maximum | 64046667 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 2.8 KiB |
Quantile statistics
| Minimum | 30000491 |
|---|---|
| 5-th percentile | 30075811.55 |
| Q1 | 30695415 |
| median | 30848452.5 |
| Q3 | 30883327.25 |
| 95-th percentile | 42255982.8 |
| Maximum | 64046667 |
| Range | 34046176 |
| Interquartile range (IQR) | 187912.25 |
Descriptive statistics
| Standard deviation | 6006289.835 |
|---|---|
| Coefficient of variation (CV) | 0.1815583296 |
| Kurtosis | 10.18321082 |
| Mean | 33081874.29 |
| Median Absolute Deviation (MAD) | 42509 |
| Skewness | 3.04798723 |
| Sum | 1.138016476 × 10¹⁰ |
| Variance | 3.607551759 × 10¹³ |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 30840409 | 2 | 0.6% |
| 30853360 | 2 | 0.6% |
| 30853188 | 2 | 0.6% |
| 30251995 | 2 | 0.6% |
| 30817472 | 2 | 0.6% |
| 30814537 | 2 | 0.6% |
| 30840516 | 2 | 0.6% |
| 30840466 | 2 | 0.6% |
| 30684070 | 2 | 0.6% |
| 30253397 | 2 | 0.6% |
| Other values (281) | 324 | 94.2% |
| Value | Count | Frequency (%) |
| 30000491 | 1 | 0.3% |
| 30008791 | 1 | 0.3% |
| 30009898 | 1 | 0.3% |
| 30013395 | 1 | 0.3% |
| 30013692 | 1 | 0.3% |
| 30017099 | 1 | 0.3% |
| 30017693 | 1 | 0.3% |
| 30018998 | 1 | 0.3% |
| 30019392 | 1 | 0.3% |
| 30019699 | 1 | 0.3% |
| Value | Count | Frequency (%) |
| 64046667 | 1 | 0.3% |
| 63841308 | 1 | 0.3% |
| 63144927 | 1 | 0.3% |
| 61274643 | 1 | 0.3% |
| 60302023 | 1 | 0.3% |
| 58960428 | 1 | 0.3% |
| 58960311 | 1 | 0.3% |
| 56846975 | 1 | 0.3% |
| 56838634 | 1 | 0.3% |
| 52564887 | 1 | 0.3% |
destination_esr
Real number (ℝ≥0)
| Distinct | 8 |
|---|---|
| Distinct (%) | 2.3% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 915003.2035 |
|---|---|
| Minimum | 843200 |
| Maximum | 988109 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 2.8 KiB |
Quantile statistics
| Minimum | 843200 |
|---|---|
| 5-th percentile | 843200 |
| Q1 | 904705 |
| median | 904705 |
| Q3 | 918407 |
| 95-th percentile | 988109 |
| Maximum | 988109 |
| Range | 144909 |
| Interquartile range (IQR) | 13702 |
Descriptive statistics
| Standard deviation | 33388.80259 |
|---|---|
| Coefficient of variation (CV) | 0.03649036688 |
| Kurtosis | 1.247045576 |
| Mean | 915003.2035 |
| Median Absolute Deviation (MAD) | 13702 |
| Skewness | 0.4380156003 |
| Sum | 314761102 |
| Variance | 1114812138 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 904705 | 159 | 46.2% |
| 918407 | 98 | 28.5% |
| 988109 | 38 | 11.0% |
| 843200 | 20 | 5.8% |
| 946801 | 17 | 4.9% |
| 853005 | 10 | 2.9% |
| 964809 | 1 | 0.3% |
| 906503 | 1 | 0.3% |
| Value | Count | Frequency (%) |
| 843200 | 20 | 5.8% |
| 853005 | 10 | 2.9% |
| 904705 | 159 | 46.2% |
| 906503 | 1 | 0.3% |
| 918407 | 98 | 28.5% |
| 946801 | 17 | 4.9% |
| 964809 | 1 | 0.3% |
| 988109 | 38 | 11.0% |
| Value | Count | Frequency (%) |
| 988109 | 38 | 11.0% |
| 964809 | 1 | 0.3% |
| 946801 | 17 | 4.9% |
| 918407 | 98 | 28.5% |
| 906503 | 1 | 0.3% |
| 904705 | 159 | 46.2% |
| 853005 | 10 | 2.9% |
| 843200 | 20 | 5.8% |
adm
Categorical
| Distinct | 2 |
|---|---|
| Distinct (%) | 0.6% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 20.6 KiB |
| 20.0 | 327 |
|---|---|
| 33.0 | 17 |
Length
| Max length | 4 |
|---|---|
| Median length | 4 |
| Mean length | 4 |
| Min length | 4 |
Characters and Unicode
| Total characters | 1376 |
|---|---|
| Distinct characters | 4 |
| Distinct categories | 2 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 20.0 |
|---|---|
| 2nd row | 20.0 |
| 3rd row | 20.0 |
| 4th row | 20.0 |
| 5th row | 20.0 |
| Value | Count | Frequency (%) |
| 20.0 | 327 | 95.1% |
| 33.0 | 17 | 4.9% |
| Value | Count | Frequency (%) |
| 20.0 | 327 | 95.1% |
| 33.0 | 17 | 4.9% |
Most occurring characters
| Value | Count | Frequency (%) |
| 0 | 671 | 48.8% |
| . | 344 | 25.0% |
| 2 | 327 | 23.8% |
| 3 | 34 | 2.5% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 1032 | 75.0% |
| Other Punctuation | 344 | 25.0% |
Most frequent character per category
| Value | Count | Frequency (%) |
| 0 | 671 | 65.0% |
| 2 | 327 | 31.7% |
| 3 | 34 | 3.3% |
| Value | Count | Frequency (%) |
| . | 344 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 1376 | 100.0% |
Most frequent character per script
| Value | Count | Frequency (%) |
| 0 | 671 | 48.8% |
| . | 344 | 25.0% |
| 2 | 327 | 23.8% |
| 3 | 34 | 2.5% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 1376 | 100.0% |
Most frequent character per block
| Value | Count | Frequency (%) |
| 0 | 671 | 48.8% |
| . | 344 | 25.0% |
| 2 | 327 | 23.8% |
| 3 | 34 | 2.5% |
gruz
Real number (ℝ≥0)
| Distinct | 13 |
|---|---|
| Distinct (%) | 3.8% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 254680.6831 |
|---|---|
| Minimum | 233010 |
| Maximum | 999993 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 2.8 KiB |
Quantile statistics
| Minimum | 233010 |
|---|---|
| 5-th percentile | 236038 |
| Q1 | 236038 |
| median | 236038 |
| Q3 | 236038 |
| 95-th percentile | 321067 |
| Maximum | 999993 |
| Range | 766983 |
| Interquartile range (IQR) | 0 |
Descriptive statistics
| Standard deviation | 56329.7644 |
|---|---|
| Coefficient of variation (CV) | 0.221178001 |
| Kurtosis | 91.32038154 |
| Mean | 254680.6831 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 7.757277469 |
| Sum | 87610155 |
| Variance | 3173042357 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 236038 | 274 | 79.7% |
| 321067 | 40 | 11.6% |
| 303069 | 12 | 3.5% |
| 242128 | 7 | 2.0% |
| 421161 | 2 | 0.6% |
| 233010 | 2 | 0.6% |
| 351306 | 1 | 0.3% |
| 542188 | 1 | 0.3% |
| 411155 | 1 | 0.3% |
| 411263 | 1 | 0.3% |
| Other values (3) | 3 | 0.9% |
| Value | Count | Frequency (%) |
| 233010 | 2 | 0.6% |
| 236038 | 274 | 79.7% |
| 242128 | 7 | 2.0% |
| 302032 | 1 | 0.3% |
| 303069 | 12 | 3.5% |
| 321067 | 40 | 11.6% |
| 351306 | 1 | 0.3% |
| 411155 | 1 | 0.3% |
| 411263 | 1 | 0.3% |
| 421161 | 2 | 0.6% |
| Value | Count | Frequency (%) |
| 999993 | 1 | 0.3% |
| 542188 | 1 | 0.3% |
| 435060 | 1 | 0.3% |
| 421161 | 2 | 0.6% |
| 411263 | 1 | 0.3% |
| 411155 | 1 | 0.3% |
| 351306 | 1 | 0.3% |
| 321067 | 40 | 11.6% |
| 303069 | 12 | 3.5% |
| 302032 | 1 | 0.3% |
operation_car
Categorical
| Distinct | 1 |
|---|---|
| Distinct (%) | 0.3% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 20.6 KiB |
| 29.0 | 344 |
|---|---|
Length
| Max length | 4 |
|---|---|
| Median length | 4 |
| Mean length | 4 |
| Min length | 4 |
Characters and Unicode
| Total characters | 1376 |
|---|---|
| Distinct characters | 4 |
| Distinct categories | 2 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 29.0 |
|---|---|
| 2nd row | 29.0 |
| 3rd row | 29.0 |
| 4th row | 29.0 |
| 5th row | 29.0 |
| Value | Count | Frequency (%) |
| 29.0 | 344 | 100.0% |
| Value | Count | Frequency (%) |
| 29.0 | 344 | 100.0% |
Most occurring characters
| Value | Count | Frequency (%) |
| 2 | 344 | 25.0% |
| 9 | 344 | 25.0% |
| . | 344 | 25.0% |
| 0 | 344 | 25.0% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 1032 | 75.0% |
| Other Punctuation | 344 | 25.0% |
Most frequent character per category
| Value | Count | Frequency (%) |
| 2 | 344 | 33.3% |
| 9 | 344 | 33.3% |
| 0 | 344 | 33.3% |
| Value | Count | Frequency (%) |
| . | 344 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 1376 | 100.0% |
Most frequent character per script
| Value | Count | Frequency (%) |
| 2 | 344 | 25.0% |
| 9 | 344 | 25.0% |
| . | 344 | 25.0% |
| 0 | 344 | 25.0% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 1376 | 100.0% |
Most frequent character per block
| Value | Count | Frequency (%) |
| 2 | 344 | 25.0% |
| 9 | 344 | 25.0% |
| . | 344 | 25.0% |
| 0 | 344 | 25.0% |
operation_date
Categorical
| Distinct | 21 |
|---|---|
| Distinct (%) | 6.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 25.7 KiB |
| 2020-07-29 05:05:00 | 53 |
|---|---|
| 2020-07-17 05:03:00 | 53 |
| 2020-07-14 13:55:00 | 38 |
| 2020-07-15 22:14:00 | 38 |
| 2020-07-27 17:58:00 | 37 |
| Other values (16) | 125 |
Length
| Max length | 19 |
|---|---|
| Median length | 19 |
| Mean length | 19 |
| Min length | 19 |
Characters and Unicode
| Total characters | 6536 |
|---|---|
| Distinct characters | 12 |
| Distinct categories | 4 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 4 |
|---|---|
| Unique (%) | 1.2% |
Sample
| 1st row | 2020-07-17 05:03:00 |
|---|---|
| 2nd row | 2020-07-17 05:03:00 |
| 3rd row | 2020-07-17 05:03:00 |
| 4th row | 2020-07-17 05:03:00 |
| 5th row | 2020-07-17 05:03:00 |
| Value | Count | Frequency (%) |
| 2020-07-29 05:05:00 | 53 | 15.4% |
| 2020-07-17 05:03:00 | 53 | 15.4% |
| 2020-07-14 13:55:00 | 38 | 11.0% |
| 2020-07-15 22:14:00 | 38 | 11.0% |
| 2020-07-27 17:58:00 | 37 | 10.8% |
| 2020-07-25 18:53:00 | 22 | 6.4% |
| 2020-07-20 12:10:00 | 20 | 5.8% |
| 2020-07-30 17:05:00 | 20 | 5.8% |
| 2020-07-23 13:00:00 | 16 | 4.7% |
| 2020-07-09 05:49:00 | 14 | 4.1% |
| Other values (11) | 33 | 9.6% |
| Value | Count | Frequency (%) |
| 2020-07-29 | 53 | 7.7% |
| 05:05:00 | 53 | 7.7% |
| 05:03:00 | 53 | 7.7% |
| 2020-07-17 | 53 | 7.7% |
| 2020-07-14 | 53 | 7.7% |
| 2020-07-27 | 40 | 5.8% |
| 22:14:00 | 38 | 5.5% |
| 2020-07-15 | 38 | 5.5% |
| 13:55:00 | 38 | 5.5% |
| 17:58:00 | 37 | 5.4% |
| Other values (25) | 232 | 33.7% |
Most occurring characters
| Value | Count | Frequency (%) |
| 0 | 2101 | 32.1% |
| 2 | 960 | 14.7% |
| - | 688 | 10.5% |
| : | 688 | 10.5% |
| 7 | 494 | 7.6% |
| 5 | 416 | 6.4% |
| 1 | 393 | 6.0% |
| (space) | 344 | 5.3% |
| 3 | 194 | 3.0% |
| 4 | 112 | 1.7% |
| Other values (2) | 146 | 2.2% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 4816 | 73.7% |
| Dash Punctuation | 688 | 10.5% |
| Other Punctuation | 688 | 10.5% |
| Space Separator | 344 | 5.3% |
Most frequent character per category
| Value | Count | Frequency (%) |
| 0 | 2101 | 43.6% |
| 2 | 960 | 19.9% |
| 7 | 494 | 10.3% |
| 5 | 416 | 8.6% |
| 1 | 393 | 8.2% |
| 3 | 194 | 4.0% |
| 4 | 112 | 2.3% |
| 9 | 81 | 1.7% |
| 8 | 65 | 1.3% |
| Value | Count | Frequency (%) |
| - | 688 | 100.0% |
| Value | Count | Frequency (%) |
| (space) | 344 | 100.0% |
| Value | Count | Frequency (%) |
| : | 688 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 6536 | 100.0% |
Most frequent character per script
| Value | Count | Frequency (%) |
| 0 | 2101 | 32.1% |
| 2 | 960 | 14.7% |
| - | 688 | 10.5% |
| : | 688 | 10.5% |
| 7 | 494 | 7.6% |
| 5 | 416 | 6.4% |
| 1 | 393 | 6.0% |
| (space) | 344 | 5.3% |
| 3 | 194 | 3.0% |
| 4 | 112 | 1.7% |
| Other values (2) | 146 | 2.2% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 6536 | 100.0% |
Most frequent character per block
| Value | Count | Frequency (%) |
| 0 | 2101 | 32.1% |
| 2 | 960 | 14.7% |
| - | 688 | 10.5% |
| : | 688 | 10.5% |
| 7 | 494 | 7.6% |
| 5 | 416 | 6.4% |
| 1 | 393 | 6.0% |
| (space) | 344 | 5.3% |
| 3 | 194 | 3.0% |
| 4 | 112 | 1.7% |
| Other values (2) | 146 | 2.2% |
operation_st_esr
Real number (ℝ≥0)
| Distinct | 8 |
|---|---|
| Distinct (%) | 2.3% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 915003.2035 |
|---|---|
| Minimum | 843200 |
| Maximum | 988109 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 2.8 KiB |
Quantile statistics
| Minimum | 843200 |
|---|---|
| 5-th percentile | 843200 |
| Q1 | 904705 |
| median | 904705 |
| Q3 | 918407 |
| 95-th percentile | 988109 |
| Maximum | 988109 |
| Range | 144909 |
| Interquartile range (IQR) | 13702 |
Descriptive statistics
| Standard deviation | 33388.80259 |
|---|---|
| Coefficient of variation (CV) | 0.03649036688 |
| Kurtosis | 1.247045576 |
| Mean | 915003.2035 |
| Median Absolute Deviation (MAD) | 13702 |
| Skewness | 0.4380156003 |
| Sum | 314761102 |
| Variance | 1114812138 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 904705 | 159 | 46.2% |
| 918407 | 98 | 28.5% |
| 988109 | 38 | 11.0% |
| 843200 | 20 | 5.8% |
| 946801 | 17 | 4.9% |
| 853005 | 10 | 2.9% |
| 964809 | 1 | 0.3% |
| 906503 | 1 | 0.3% |
| Value | Count | Frequency (%) |
| 843200 | 20 | 5.8% |
| 853005 | 10 | 2.9% |
| 904705 | 159 | 46.2% |
| 906503 | 1 | 0.3% |
| 918407 | 98 | 28.5% |
| 946801 | 17 | 4.9% |
| 964809 | 1 | 0.3% |
| 988109 | 38 | 11.0% |
| Value | Count | Frequency (%) |
| 988109 | 38 | 11.0% |
| 964809 | 1 | 0.3% |
| 946801 | 17 | 4.9% |
| 918407 | 98 | 28.5% |
| 906503 | 1 | 0.3% |
| 904705 | 159 | 46.2% |
| 853005 | 10 | 2.9% |
| 843200 | 20 | 5.8% |
operation_st_id
Real number (ℝ≥0)
| Distinct | 8 |
|---|---|
| Distinct (%) | 2.3% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 2000201913 |
|---|---|
| Minimum | 2000036238 |
| Maximum | 2001930738 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 2.8 KiB |
Quantile statistics
| Minimum | 2000036238 |
|---|---|
| 5-th percentile | 2000036238 |
| Q1 | 2000036238 |
| median | 2000036472 |
| Q3 | 2000036820 |
| 95-th percentile | 2001930660 |
| Maximum | 2001930738 |
| Range | 1894500 |
| Interquartile range (IQR) | 581.5 |
Descriptive statistics
| Standard deviation | 535138.8186 |
|---|---|
| Coefficient of variation (CV) | 0.0002675423991 |
| Kurtosis | 6.676191807 |
| Mean | 2000201913 |
| Median Absolute Deviation (MAD) | 234 |
| Skewness | 2.938941154 |
| Sum | 6.880694582 × 10¹¹ |
| Variance | 2.863735552 × 10¹¹ |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 2000036238 | 159 | 46.2% |
| 2000036472 | 98 | 28.5% |
| 2000039028 | 38 | 11.0% |
| 2001930660 | 20 | 5.8% |
| 2000037862 | 17 | 4.9% |
| 2001930738 | 10 | 2.9% |
| 2000038510 | 1 | 0.3% |
| 2000036274 | 1 | 0.3% |
| Value | Count | Frequency (%) |
| 2000036238 | 159 | 46.2% |
| 2000036274 | 1 | 0.3% |
| 2000036472 | 98 | 28.5% |
| 2000037862 | 17 | 4.9% |
| 2000038510 | 1 | 0.3% |
| 2000039028 | 38 | 11.0% |
| 2001930660 | 20 | 5.8% |
| 2001930738 | 10 | 2.9% |
| Value | Count | Frequency (%) |
| 2001930738 | 10 | 2.9% |
| 2001930660 | 20 | 5.8% |
| 2000039028 | 38 | 11.0% |
| 2000038510 | 1 | 0.3% |
| 2000037862 | 17 | 4.9% |
| 2000036472 | 98 | 28.5% |
| 2000036274 | 1 | 0.3% |
| 2000036238 | 159 | 46.2% |
receiver
Real number (ℝ≥0)
| Distinct | 7 |
|---|---|
| Distinct (%) | 2.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 17083387.85 |
|---|---|
| Minimum | 0 |
| Maximum | 79311499 |
| Zeros | 17 |
| Zeros (%) | 4.9% |
| Memory size | 2.8 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 11866783 |
| Q1 | 14999355 |
| median | 14999355 |
| Q3 | 14999355 |
| 95-th percentile | 58786880 |
| Maximum | 79311499 |
| Range | 79311499 |
| Interquartile range (IQR) | 0 |
Descriptive statistics
| Standard deviation | 11799436.96 |
|---|---|
| Coefficient of variation (CV) | 0.6906965443 |
| Kurtosis | 9.828865565 |
| Mean | 17083387.85 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 3.105661416 |
| Sum | 5876685422 |
| Variance | 1.392267125 × 10¹⁴ |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 14999355 | 296 | 86.0% |
| 58786880 | 20 | 5.8% |
| 0 | 17 | 4.9% |
| 11866783 | 7 | 2.0% |
| 15082601 | 2 | 0.6% |
| 68594560 | 1 | 0.3% |
| 79311499 | 1 | 0.3% |
| Value | Count | Frequency (%) |
| 0 | 17 | 4.9% |
| 11866783 | 7 | 2.0% |
| 14999355 | 296 | 86.0% |
| 15082601 | 2 | 0.6% |
| 58786880 | 20 | 5.8% |
| 68594560 | 1 | 0.3% |
| 79311499 | 1 | 0.3% |
| Value | Count | Frequency (%) |
| 79311499 | 1 | 0.3% |
| 68594560 | 1 | 0.3% |
| 58786880 | 20 | 5.8% |
| 15082601 | 2 | 0.6% |
| 14999355 | 296 | 86.0% |
| 11866783 | 7 | 2.0% |
| 0 | 17 | 4.9% |
rodvag
Categorical
| Distinct | 5 |
|---|---|
| Distinct (%) | 1.5% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 20.6 KiB |
| 90.0 | 293 |
|---|---|
| 40.0 | 41 |
| 60.0 | 7 |
| 93.0 | 2 |
| 20.0 | 1 |
Length
| Max length | 4 |
|---|---|
| Median length | 4 |
| Mean length | 4 |
| Min length | 4 |
Characters and Unicode
| Total characters | 1376 |
|---|---|
| Distinct characters | 7 |
| Distinct categories | 2 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 1 |
|---|---|
| Unique (%) | 0.3% |
Sample
| 1st row | 90.0 |
|---|---|
| 2nd row | 90.0 |
| 3rd row | 90.0 |
| 4th row | 90.0 |
| 5th row | 90.0 |
| Value | Count | Frequency (%) |
| 90.0 | 293 | 85.2% |
| 40.0 | 41 | 11.9% |
| 60.0 | 7 | 2.0% |
| 93.0 | 2 | 0.6% |
| 20.0 | 1 | 0.3% |
| Value | Count | Frequency (%) |
| 90.0 | 293 | 85.2% |
| 40.0 | 41 | 11.9% |
| 60.0 | 7 | 2.0% |
| 93.0 | 2 | 0.6% |
| 20.0 | 1 | 0.3% |
Most occurring characters
| Value | Count | Frequency (%) |
| 0 | 686 | 49.9% |
| . | 344 | 25.0% |
| 9 | 295 | 21.4% |
| 4 | 41 | 3.0% |
| 6 | 7 | 0.5% |
| 3 | 2 | 0.1% |
| 2 | 1 | 0.1% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 1032 | 75.0% |
| Other Punctuation | 344 | 25.0% |
Most frequent character per category
| Value | Count | Frequency (%) |
| 0 | 686 | 66.5% |
| 9 | 295 | 28.6% |
| 4 | 41 | 4.0% |
| 6 | 7 | 0.7% |
| 3 | 2 | 0.2% |
| 2 | 1 | 0.1% |
| Value | Count | Frequency (%) |
| . | 344 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 1376 | 100.0% |
Most frequent character per script
| Value | Count | Frequency (%) |
| 0 | 686 | 49.9% |
| . | 344 | 25.0% |
| 9 | 295 | 21.4% |
| 4 | 41 | 3.0% |
| 6 | 7 | 0.5% |
| 3 | 2 | 0.1% |
| 2 | 1 | 0.1% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 1376 | 100.0% |
Most frequent character per block
| Value | Count | Frequency (%) |
| 0 | 686 | 49.9% |
| . | 344 | 25.0% |
| 9 | 295 | 21.4% |
| 4 | 41 | 3.0% |
| 6 | 7 | 0.5% |
| 3 | 2 | 0.1% |
| 2 | 1 | 0.1% |
sender
Real number (ℝ≥0)
| Distinct | 7 |
|---|---|
| Distinct (%) | 2.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 14763184.75 |
|---|---|
| Minimum | 0 |
| Maximum | 58786880 |
| Zeros | 18 |
| Zeros (%) | 5.2% |
| Memory size | 2.8 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 710091.9 |
| Q1 | 4733946 |
| median | 14999355 |
| Q3 | 14999355 |
| 95-th percentile | 58786880 |
| Maximum | 58786880 |
| Range | 58786880 |
| Interquartile range (IQR) | 10265409 |
Descriptive statistics
| Standard deviation | 14251262.81 |
|---|---|
| Coefficient of variation (CV) | 0.9653244237 |
| Kurtosis | 4.439173207 |
| Mean | 14763184.75 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 2.230435416 |
| Sum | 5078535554 |
| Variance | 2.030984917 × 10¹⁴ |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 14999355 | 190 | 55.2% |
| 4733946 | 106 | 30.8% |
| 58786880 | 20 | 5.8% |
| 0 | 18 | 5.2% |
| 55545896 | 7 | 2.0% |
| 52255181 | 2 | 0.6% |
| 57790594 | 1 | 0.3% |
| Value | Count | Frequency (%) |
| 0 | 18 | 5.2% |
| 4733946 | 106 | 30.8% |
| 14999355 | 190 | 55.2% |
| 52255181 | 2 | 0.6% |
| 55545896 | 7 | 2.0% |
| 57790594 | 1 | 0.3% |
| 58786880 | 20 | 5.8% |
| Value | Count | Frequency (%) |
| 58786880 | 20 | 5.8% |
| 57790594 | 1 | 0.3% |
| 55545896 | 7 | 2.0% |
| 52255181 | 2 | 0.6% |
| 14999355 | 190 | 55.2% |
| 4733946 | 106 | 30.8% |
| 0 | 18 | 5.2% |
tare_weight
Real number (ℝ≥0)
| Distinct | 21 |
|---|---|
| Distinct (%) | 6.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 230.9680233 |
|---|---|
| Minimum | 184 |
| Maximum | 550 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 2.8 KiB |
Quantile statistics
| Minimum | 184 |
|---|---|
| 5-th percentile | 184 |
| Q1 | 230 |
| median | 232 |
| Q3 | 237 |
| 95-th percentile | 240 |
| Maximum | 550 |
| Range | 366 |
| Interquartile range (IQR) | 7 |
Descriptive statistics
| Standard deviation | 30.28940333 |
|---|---|
| Coefficient of variation (CV) | 0.1311411117 |
| Kurtosis | 70.74989188 |
| Mean | 230.9680233 |
| Median Absolute Deviation (MAD) | 3 |
| Skewness | 6.530775486 |
| Sum | 79453 |
| Variance | 917.4479541 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 232 | 90 | 26.2% |
| 237 | 62 | 18.0% |
| 230 | 62 | 18.0% |
| 184 | 36 | 10.5% |
| 240 | 29 | 8.4% |
| 233 | 19 | 5.5% |
| 270 | 13 | 3.8% |
| 238 | 5 | 1.5% |
| 210 | 5 | 1.5% |
| 225 | 4 | 1.2% |
| Other values (11) | 19 | 5.5% |
| Value | Count | Frequency (%) |
| 184 | 36 | 10.5% |
| 193 | 2 | 0.6% |
| 210 | 5 | 1.5% |
| 220 | 1 | 0.3% |
| 222 | 3 | 0.9% |
| 223 | 1 | 0.3% |
| 225 | 4 | 1.2% |
| 226 | 2 | 0.6% |
| 227 | 1 | 0.3% |
| 230 | 62 | 18.0% |
| Value | Count | Frequency (%) |
| 550 | 2 | 0.6% |
| 270 | 13 | 3.8% |
| 259 | 1 | 0.3% |
| 242 | 1 | 0.3% |
| 240 | 29 | 8.4% |
| 239 | 2 | 0.6% |
| 238 | 5 | 1.5% |
| 237 | 62 | 18.0% |
| 235 | 3 | 0.9% |
| 233 | 19 | 5.5% |
Pearson's r
Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation, and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r. To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
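The definition above (covariance divided by the product of the standard deviations) can be sketched in plain Python. This is an illustrative implementation rather than the code the report itself runs; the function name and sample data are made up for the example.

```python
import math

def pearson_r(xs, ys):
    """Covariance of xs and ys divided by the product of their standard deviations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # The 1/n factors of the covariance and the variances cancel, so sums suffice.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

xs = [1, 2, 3, 4, 5]
ys = [2, 1, 4, 3, 5]
r = pearson_r(xs, ys)

# Shifting and (positively) rescaling one variable leaves r unchanged.
r_rescaled = pearson_r(xs, [3 * y + 7 for y in ys])
print(r, r_rescaled)
```

Note that the invariance holds for positive scale factors; multiplying one variable by a negative constant flips the sign of r.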
Spearman's ρ
Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation, and +1 indicating total positive monotonic correlation. To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
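As a sketch of that recipe, ρ can be computed by ranking both variables and applying the Pearson formula to the ranks. The helper names below are illustrative, and tied values are given their average rank, which is a common convention but an assumption here.

```python
import math

def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def pearson_r(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return cov / math.sqrt(sum((x - mx) ** 2 for x in xs) *
                           sum((y - my) ** 2 for y in ys))

def spearman_rho(xs, ys):
    """Pearson correlation of the rank-transformed variables."""
    return pearson_r(average_ranks(xs), average_ranks(ys))

xs = [1, 2, 3, 4, 5]
ys = [x ** 3 for x in xs]        # monotonic but not linear
print(spearman_rho(xs, ys), pearson_r(xs, ys))
```

For the cubic relationship in the example, ρ reaches 1 while Pearson's r stays below 1, which is the distinction the paragraph above describes.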
Kendall's τ
Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation, and +1 indicating total positive correlation. To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
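The pair-counting formula in the paragraph above corresponds to the variant usually called τ-a. A minimal sketch follows; the function name is illustrative, and tied pairs are counted as neither concordant nor discordant, which is an assumption of this example. Library implementations may apply a tie correction and so can give different values on data with many ties.

```python
from itertools import combinations

def kendall_tau_a(xs, ys):
    """(concordant - discordant) / total number of pairs."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        product = (x1 - x2) * (y1 - y2)
        if product > 0:        # both variables move in the same direction
            concordant += 1
        elif product < 0:      # the variables move in opposite directions
            discordant += 1
        # product == 0 means a tie in x or y; it counts toward neither.
    n = len(xs)
    total_pairs = n * (n - 1) // 2
    return (concordant - discordant) / total_pairs

tau = kendall_tau_a([1, 2, 3, 4], [1, 3, 2, 4])
print(tau)
```

In the example, five of the six pairs are concordant and one is discordant, so τ = (5 − 1) / 6.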
Phik (φk)
Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in case of a bivariate normal input distribution. Extensive documentation is available for the phik package.
Cramér's V (φc)
Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been proved to be biased, even for large samples. We use a bias-corrected measure proposed by Bergsma in 2013.
First rows
| | df_index | index_train | length | car_number | destination_esr | adm | danger | gruz | loaded | operation_car | operation_date | operation_st_esr | operation_st_id | operation_train | receiver | rodvag | rod_train | sender | ssp_station_esr | ssp_station_id | tare_weight | weight_brutto |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 64867 | NaN | 0.83 | 30842983 | 904705.0 | 20.0 | NaN | 236038.0 | NaN | 29.0 | 2020-07-17 05:03:00 | 904705.0 | 2.000036e+09 | NaN | 14999355.0 | 90.0 | NaN | 4733946.0 | NaN | NaN | 232.0 | NaN |
| 1 | 71214 | NaN | 0.83 | 30817472 | 904705.0 | 20.0 | NaN | 236038.0 | NaN | 29.0 | 2020-07-17 05:03:00 | 904705.0 | 2.000036e+09 | NaN | 14999355.0 | 90.0 | NaN | 4733946.0 | NaN | NaN | 240.0 | NaN |
| 2 | 71282 | NaN | 0.83 | 30824577 | 904705.0 | 20.0 | NaN | 236038.0 | NaN | 29.0 | 2020-07-17 05:03:00 | 904705.0 | 2.000036e+09 | NaN | 14999355.0 | 90.0 | NaN | 4733946.0 | NaN | NaN | 235.0 | NaN |
| 3 | 71288 | NaN | 0.83 | 30824825 | 904705.0 | 20.0 | NaN | 236038.0 | NaN | 29.0 | 2020-07-17 05:03:00 | 904705.0 | 2.000036e+09 | NaN | 14999355.0 | 90.0 | NaN | 4733946.0 | NaN | NaN | 238.0 | NaN |
| 4 | 71680 | NaN | 0.83 | 30840409 | 904705.0 | 20.0 | NaN | 236038.0 | NaN | 29.0 | 2020-07-17 05:03:00 | 904705.0 | 2.000036e+09 | NaN | 14999355.0 | 90.0 | NaN | 4733946.0 | NaN | NaN | 233.0 | NaN |
| 5 | 71684 | NaN | 0.83 | 30840508 | 904705.0 | 20.0 | NaN | 236038.0 | NaN | 29.0 | 2020-07-17 05:03:00 | 904705.0 | 2.000036e+09 | NaN | 14999355.0 | 90.0 | NaN | 4733946.0 | NaN | NaN | 233.0 | NaN |
| 6 | 71686 | NaN | 0.83 | 30840466 | 904705.0 | 20.0 | NaN | 236038.0 | NaN | 29.0 | 2020-07-17 05:03:00 | 904705.0 | 2.000036e+09 | NaN | 14999355.0 | 90.0 | NaN | 4733946.0 | NaN | NaN | 233.0 | NaN |
| 7 | 71688 | NaN | 0.83 | 30840607 | 904705.0 | 20.0 | NaN | 236038.0 | NaN | 29.0 | 2020-07-17 05:03:00 | 904705.0 | 2.000036e+09 | NaN | 14999355.0 | 90.0 | NaN | 4733946.0 | NaN | NaN | 233.0 | NaN |
| 8 | 71690 | NaN | 0.83 | 30840516 | 904705.0 | 20.0 | NaN | 236038.0 | NaN | 29.0 | 2020-07-17 05:03:00 | 904705.0 | 2.000036e+09 | NaN | 14999355.0 | 90.0 | NaN | 4733946.0 | NaN | NaN | 233.0 | NaN |
| 9 | 71696 | NaN | 0.83 | 30848378 | 904705.0 | 20.0 | NaN | 236038.0 | NaN | 29.0 | 2020-07-17 05:03:00 | 904705.0 | 2.000036e+09 | NaN | 14999355.0 | 90.0 | NaN | 4733946.0 | NaN | NaN | 230.0 | NaN |
Last rows
| | df_index | index_train | length | car_number | destination_esr | adm | danger | gruz | loaded | operation_car | operation_date | operation_st_esr | operation_st_id | operation_train | receiver | rodvag | rod_train | sender | ssp_station_esr | ssp_station_id | tare_weight | weight_brutto |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 334 | 4041761 | NaN | 0.83 | 30813661 | 918407.0 | 20.0 | NaN | 236038.0 | NaN | 29.0 | 2020-07-15 22:14:00 | 918407.0 | 2.000036e+09 | NaN | 14999355.0 | 90.0 | NaN | 14999355.0 | NaN | NaN | 235.0 | NaN |
| 335 | 4043223 | NaN | 0.83 | 30885602 | 918407.0 | 20.0 | NaN | 236038.0 | NaN | 29.0 | 2020-07-15 22:14:00 | 918407.0 | 2.000036e+09 | NaN | 14999355.0 | 90.0 | NaN | 14999355.0 | NaN | NaN | 232.0 | NaN |
| 336 | 4043282 | NaN | 0.83 | 30883409 | 918407.0 | 20.0 | NaN | 236038.0 | NaN | 29.0 | 2020-07-15 22:14:00 | 918407.0 | 2.000036e+09 | NaN | 14999355.0 | 90.0 | NaN | 14999355.0 | NaN | NaN | 232.0 | NaN |
| 337 | 4043289 | NaN | 0.83 | 30883300 | 918407.0 | 20.0 | NaN | 236038.0 | NaN | 29.0 | 2020-07-15 22:14:00 | 918407.0 | 2.000036e+09 | NaN | 14999355.0 | 90.0 | NaN | 14999355.0 | NaN | NaN | 232.0 | NaN |
| 338 | 4043296 | NaN | 0.83 | 30883433 | 918407.0 | 20.0 | NaN | 236038.0 | NaN | 29.0 | 2020-07-15 22:14:00 | 918407.0 | 2.000036e+09 | NaN | 14999355.0 | 90.0 | NaN | 14999355.0 | NaN | NaN | 232.0 | NaN |
| 339 | 4043303 | NaN | 0.83 | 30883425 | 918407.0 | 20.0 | NaN | 236038.0 | NaN | 29.0 | 2020-07-15 22:14:00 | 918407.0 | 2.000036e+09 | NaN | 14999355.0 | 90.0 | NaN | 14999355.0 | NaN | NaN | 232.0 | NaN |
| 340 | 4043330 | NaN | 0.83 | 30891287 | 918407.0 | 20.0 | NaN | 236038.0 | NaN | 29.0 | 2020-07-15 22:14:00 | 918407.0 | 2.000036e+09 | NaN | 14999355.0 | 90.0 | NaN | 14999355.0 | NaN | NaN | 232.0 | NaN |
| 341 | 4043334 | NaN | 0.83 | 30891154 | 918407.0 | 20.0 | NaN | 236038.0 | NaN | 29.0 | 2020-07-15 22:14:00 | 918407.0 | 2.000036e+09 | NaN | 14999355.0 | 90.0 | NaN | 14999355.0 | NaN | NaN | 232.0 | NaN |
| 342 | 4043540 | NaN | 0.83 | 30891352 | 918407.0 | 20.0 | NaN | 236038.0 | NaN | 29.0 | 2020-07-15 22:14:00 | 918407.0 | 2.000036e+09 | NaN | 14999355.0 | 90.0 | NaN | 14999355.0 | NaN | NaN | 232.0 | NaN |
| 343 | 4043546 | NaN | 0.83 | 30891303 | 918407.0 | 20.0 | NaN | 236038.0 | NaN | 29.0 | 2020-07-15 22:14:00 | 918407.0 | 2.000036e+09 | NaN | 14999355.0 | 90.0 | NaN | 14999355.0 | NaN | NaN | 232.0 | NaN |